chunk prefill #498
@rolandschulz @tdeng5 @jiyang1011 please review
applications/flash_attention_v2/collective/xe_flash_attn_chunk_prefill_epilogue.hpp
applications/flash_attention_v2/collective/xe_flash_attn_chunk_prefill_epilogue.hpp
0505aed to 8c5d3ce
@rolandschulz @tdeng5 @jiyang1011 please review
applications/flash_attention_v2/kernel/tile_scheduler_chunk_prefill.hpp
5490cb9 to cbddd11
@rolandschulz pls review
This change imports `SYCLCompat` into the cutlass-sycl repo as `compat`. Previous dependencies on `syclcompat` are changed to `compat`. This PR also fixes some failures of `SYCLCompat` in oneAPI 2025.2.
---------
Co-authored-by: Roland Schulz <[email protected]>
b92119c to 8048471
@Antonyvance pls review again~
@sunjiweiswift I believe this needs to be reimplemented based on PR 547. Would you be able to adapt it?
Can I merge it first? sglang-xpu already uses this kernel. However, thirdparty currently points to my forked repo, so I can't use the public repo. The new API will be available after PR 547 is merged. I will adapt and modify it in a new PR.
550bc09 to 537afe5
537afe5 to 8809894
Missing features
2. Sliding window -- done